RDF Knowledge Graph Visualization From a Knowledge Extraction System
In this paper, we present a system to visualize RDF knowledge graphs. These
graphs are obtained from a knowledge extraction system designed by
GEOLSemantics. This extraction is performed using natural language processing
and trigger detection. The user can visualize subgraphs by selecting some
ontology features such as concepts or individuals. The system is also
multilingual, relying on an ontology annotated in English, French, Arabic
and Chinese.
LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs
The number of linked data sources and the size of the linked open data graph
keep growing every day. As a consequence, semantic RDF services are more and
more confronted with various "big data" problems. Query processing in the
presence of inferences is one of them. For instance, to complete the answer set of
SPARQL queries, RDF database systems evaluate semantic RDFS relationships
(subPropertyOf, subClassOf) through time-consuming query rewriting algorithms
or space-consuming data materialization solutions. To reduce the memory
footprint and ease the exchange of large datasets, these systems generally
apply a dictionary approach for compressing triple data sizes by replacing
resource identifiers (IRIs), blank nodes and literals with integer values. In
this article, we present a structured resource identification scheme using a
clever encoding of concepts and property hierarchies for efficiently evaluating
the main common RDFS entailment rules while minimizing triple materialization
and query rewriting. We will show how this encoding can be computed by a
scalable parallel algorithm and directly be implemented over the Apache Spark
framework. The efficiency of our encoding scheme is emphasized by an evaluation
conducted over both synthetic and real-world datasets.
Comment: 8 pages, 1 figure
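The general idea of a hierarchy-aware identifier can be sketched as follows. This is a minimal illustration and not necessarily the scheme described in the article: it assumes a single-inheritance class tree, and the names `encode_hierarchy`, `to_intervals`, `is_subclass_of` and the sample classes are hypothetical. Each class receives a bit string that extends its parent's bit string, so a subClassOf check reduces to an integer range test instead of materialized triples or query rewriting.

```python
from math import ceil, log2


def encode_hierarchy(children, root):
    """Assign each class a bit-string prefix that extends its parent's prefix.

    `children` maps a class name to the list of its direct subclasses
    (assumption: the hierarchy is a tree, i.e. single inheritance).
    """
    codes = {root: "1"}
    stack = [root]
    while stack:
        parent = stack.pop()
        kids = children.get(parent, [])
        if not kids:
            continue
        # Local indices start at 1 so that index 0 stays reserved for the parent.
        width = max(1, ceil(log2(len(kids) + 1)))
        for i, kid in enumerate(kids, start=1):
            codes[kid] = codes[parent] + format(i, f"0{width}b")
            stack.append(kid)
    return codes


def to_intervals(codes):
    """Pad every prefix to a common length and return [lower, upper) integer ranges."""
    total = max(len(code) for code in codes.values())
    intervals = {}
    for name, code in codes.items():
        shift = total - len(code)
        value = int(code, 2)
        intervals[name] = (value << shift, (value + 1) << shift)
    return intervals


def is_subclass_of(intervals, sub, sup):
    """True if `sub` equals `sup` or lies anywhere below it in the tree."""
    lower, upper = intervals[sup]
    return lower <= intervals[sub][0] < upper


# Hypothetical toy hierarchy, for illustration only.
children = {
    "Thing": ["Agent", "Place"],
    "Agent": ["Person", "Organization"],
    "Person": ["Student"],
}
intervals = to_intervals(encode_hierarchy(children, "Thing"))
print(is_subclass_of(intervals, "Student", "Agent"))
print(is_subclass_of(intervals, "Place", "Agent"))
```

With identifiers of this kind, the lower bound of each interval can serve as the dictionary-encoded integer for the class, and a query asking for all instances of a class becomes a range filter on that integer.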
On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark
Querying very large RDF data sets in an efficient manner requires a
sophisticated distribution strategy. Several innovative solutions have recently
been proposed for optimizing data distribution with predefined query workloads.
This paper presents an in-depth analysis and experimental comparison of five
representative and complementary distribution approaches. To obtain fair
experimental results, we use Apache Spark as a common parallel computing
framework and reimplement the algorithms concerned with the Spark API. Spark
provides guarantees in terms of fault tolerance, high availability and
scalability which are essential in such systems. Our different implementations
aim to highlight the fundamental implementation-independent characteristics of
each approach in terms of data preparation, load balancing, data replication
and, to some extent, query answering cost and performance. The presented
measures are obtained by testing each system on one synthetic and one
real-world data set over query workloads with differing characteristics and
different partitioning constraints.
Comment: 16 pages, 3 figures
Conflict Ontology Enrichment Based on Triggers
In this paper, we propose an ontology-based approach for detecting the emergence of relational conflicts between people who cooperate on computer-supported projects. To detect these conflicts, we use this ontology to analyze the e-mails exchanged between these people. Our method aims to inform project team leaders of such situations and thus help them prevent serious disagreements between the employees involved. The approach builds a domain ontology of relational conflicts in two phases. First, we conceptualize the domain by hand; then we enrich the ontology using the trigger model, which identifies terms in corpora that correspond to different conflicts.
Identifying Conflicts Through Emails by Using an Emotion Ontology
In the context of text classification, this paper presents an approach for detecting conflicts in emails exchanged between colleagues who belong to a geographically distributed enterprise. The idea is to inform a team leader of such situations and thus help them prevent serious disagreements between team members. The approach uses the vector space model with TF*IDF weights to represent each email, and a domain ontology of relational conflicts to determine its categories. Our study also addresses the issue of building the ontology, which is done in two phases. First, we conceptualize the domain by hand; then we enrich it using the trigger model, which identifies terms in corpora that correspond to different conflicts.
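The vector space representation with TF*IDF weights mentioned above can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' pipeline: tokenization is plain whitespace splitting, the weighting is the basic term frequency times log inverse document frequency, the sample emails are invented, and the ontology-based categorization step is not shown. The names `tf_idf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one sparse {term: weight} vector per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented sample emails, for illustration only.
emails = [
    "please stop blaming me",
    "meeting moved to monday",
    "stop blaming the team",
]
vectors = tf_idf_vectors(emails)
print(cosine(vectors[0], vectors[2]) > cosine(vectors[0], vectors[1]))
```

In a setting like the one described, the terms attached to each conflict concept in the ontology could be represented in the same vector space and compared with each email vector; that matching step is left out here because the abstract does not specify it.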